Getting Started with PyTorch: Why Tensors Matter
PyTorch is a highly flexible, dynamic open-source framework widely favored for deep learning research and rapid prototyping. At its core is the tensor, its indispensable data structure: a multi-dimensional array designed for the efficient numerical computation that deep learning models require, with support for automatic GPU acceleration.
1. Understanding Tensor Structure
In PyTorch, every input, output, and model parameter is wrapped in a tensor. Tensors play the same role as NumPy arrays but are optimized for specialized hardware such as GPUs, making them far more efficient at the large linear-algebra operations neural networks require.
Key tensor attributes include:
- Shape: defines the dimensionality of the data, expressed as a tuple (e.g., a batch of images with shape $4 \times 32 \times 32$).
- Data type (dtype): specifies the numeric type of the stored elements (e.g., torch.float32 for model weights, torch.int64 for indexing operations).
- Device: indicates the physical hardware location, typically 'cpu' or 'cuda' (an NVIDIA GPU).
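All three attributes can be inspected directly on any tensor. A minimal sketch (the variable names are illustrative; the batch shape is taken from the example above):

```python
import torch

# A batch of 4 single-channel 32x32 images (shape from the example above)
images = torch.rand(4, 32, 32)

print(images.shape)   # torch.Size([4, 32, 32])
print(images.dtype)   # torch.float32 -- PyTorch's default floating-point dtype
print(images.device)  # cpu, unless moved with .to('cuda')

# Index tensors conventionally use torch.int64
indices = torch.tensor([0, 2], dtype=torch.int64)
print(indices.dtype)  # torch.int64
```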
Dynamic Graphs and Autograd
PyTorch uses an imperative execution model, meaning the computation graph is built step by step as operations execute. This allows the built-in automatic differentiation engine, Autograd, to track every operation on a tensor, provided the attribute requires_grad=True is set. As a result, computing gradients during backpropagation becomes straightforward.
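The tracking described above can be seen in a few lines; a minimal sketch with illustrative values:

```python
import torch

# requires_grad=True tells Autograd to record every operation on x
x = torch.tensor([2.0, 3.0], requires_grad=True)

# The graph is built as the operations run: y = x0^2 + x1^2
y = (x ** 2).sum()

# Backpropagation: dy/dx = 2x
y.backward()
print(x.grad)  # tensor([4., 6.])
```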
Question 1
Which command creates a $5 \times 5$ tensor containing random numbers following a uniform distribution between 0 and 1?
Question 2
If tensor $A$ is on the CPU, and tensor $B$ is on the CUDA device, what happens if you try to compute $A + B$?
Question 3
What is the most common data type (dtype) used for model weights and intermediate calculations in Deep Learning?
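The concepts behind these three questions can be checked interactively. A minimal sketch (not an official answer key; the CUDA branch only runs if a GPU is present):

```python
import torch

# Q1: uniform random values in [0, 1) with shape 5x5
u = torch.rand(5, 5)

# Q3: float32 is the default floating-point dtype for weights and activations
print(u.dtype)  # torch.float32

# Q2: mixing CPU and CUDA tensors in one op raises a RuntimeError
if torch.cuda.is_available():
    a = torch.ones(3)          # CPU tensor
    b = torch.ones(3).cuda()   # CUDA tensor
    try:
        a + b
    except RuntimeError as err:
        print("Device mismatch:", err)
```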
Challenge: Tensor Manipulation and Shape
Prepare a tensor for a specific matrix operation.
You have a feature vector $F$ of shape $(10,)$. You need to multiply it by a weight matrix $W$ of shape $(10, 5)$. For matrix multiplication (MatMul) to work, $F$ must be 2-dimensional.
Step 1
What should the shape of $F$ be before multiplication with $W$?
Solution:
The inner dimensions must match, so $F$ must be $(1, 10)$. Then $(1, 10) @ (10, 5) \rightarrow (1, 5)$.
Code:
F_new = F.unsqueeze(0) or F_new = F.view(1, -1)
Step 2
Perform the matrix multiplication between $F_{new}$ and $W$ (shape $(10, 5)$).
Solution:
The operation is straightforward MatMul.
Code:
output = F_new @ W or output = torch.matmul(F_new, W)
Step 3
Which method explicitly returns a tensor with the specified dimensions, allowing you to flatten a tensor to $(50,)$? (Assume $F$ now has shape $(5, 10)$.)
Solution:
Use the view or reshape methods. The fastest way to flatten is often using -1 for one dimension.
Code:
F_flat = F.view(-1) or F_flat = F.reshape(50)
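Putting the three steps together, a runnable sketch (shapes taken from the challenge statement; the random values are placeholders):

```python
import torch

F = torch.rand(10)       # feature vector, shape (10,)
W = torch.rand(10, 5)    # weight matrix, shape (10, 5)

# Step 1: make F 2-dimensional: (10,) -> (1, 10)
F_new = F.unsqueeze(0)   # equivalently: F.view(1, -1)

# Step 2: matrix multiplication: (1, 10) @ (10, 5) -> (1, 5)
output = F_new @ W
print(output.shape)      # torch.Size([1, 5])

# Step 3: flatten a (5, 10) tensor back to (50,)
F2 = torch.rand(5, 10)
F_flat = F2.view(-1)     # or F2.reshape(50)
print(F_flat.shape)      # torch.Size([50])
```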